Performance Comparison of Gmm, Hmm and Dnn Based Approaches for Acoustic Event Detection within Task 3 of the Dcase 2016 Challenge

نویسندگان

Jens Schröder

Jörn Anemüller

Stefan Goetze

چکیده

This contribution reports on the performance of systems for polyphonic acoustic event detection (AED) compared within the framework of the “detection and classification of acoustic scenes and events 2016” (DCASE’16) challenge. State-of-the-art Gaussian mixture model (GMM) and GMM-hidden Markov model (HMM) approaches are applied using Mel-frequency cepstral coefficients (MFCCs) and Gabor filterbank (GFB) features and a non-negative matrix factorization (NMF) based system. Furthermore, tandem and hybrid deep neural network (DNN)-HMM systems are adopted. All HMM systems that usually are of single label type, i.e., systems that only output one label per time segment from a set of possible classes, are extended to multi label classification systems that are a compound of single binary classifiers classifying between target and non-target classes and, thus, are capable of multi labeling. These systems are evaluated for the data of residential areas of Task 3 from the DCASE’16 challenge. It is shown that the DNN based system performs worse than the traditional systems for this task. Best results are achieved using GFB features in combination with a single label GMM-HMM approach.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deep Neural Network Baseline for Dcase Challenge 2016

The DCASE Challenge 2016 contains tasks for Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), and audio tagging. Since 2006, Deep Neural Networks (DNNs) have been widely applied to computer visions, speech recognition and natural language processing tasks. In this paper, we provide DNN baselines for the DCASE Challenge 2016. In Task 1 we obtained accuracy of 81.0% using Mel +...

متن کامل

A Comparison of deep learning methods for environmental sound

Environmental sound detection is a challenging application of machine learning because of the noisy nature of the signal, and the small amount of (labeled) data that is typically available. This work thus presents a comparison of several state-of-the-art Deep Learning models on the IEEE challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 challenge task and data,...

متن کامل

Fully DNN-based Multi-label regression for audio tagging

Acoustic event detection for content analysis in most cases relies on a lot of labeled data. However, manually annotating data is a time-consuming task, which thus makes few annotated resources available so far. Unlike audio event detection, automatic audio tagging, a multi-label acoustic event classification task, only relies on weakly labeled data. This is highly desirable to some practical a...

متن کامل

Hierarchical learning for DNN-based acoustic scene classification

In this paper, we present a deep neural network (DNN)-based acoustic scene classification framework. Two hierarchical learning methods are proposed to improve the DNN baseline performance by incorporating the hierarchical taxonomy information of environmental sounds. Firstly, the parameters of the DNN are initialized by the proposed hierarchical pre-training. Multi-level objective function is t...

متن کامل

Sound Event Detection for Real Life Audio DCASE Challenge

We explore logistic regression classifier (LogReg) and deep neural network (DNN) on the DCASE 2016 Challenge for task 3, i.e., sound event detection in real life audio. Our models use the Mel Frequency Cepstral Coefficients (MFCCs) and their deltas and accelerations as detection features. The error rate metric favors the simple logistic regression model with high activation threshold on both se...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Performance Comparison of Gmm, Hmm and Dnn Based Approaches for Acoustic Event Detection within Task 3 of the Dcase 2016 Challenge

نویسندگان

چکیده

منابع مشابه

Deep Neural Network Baseline for Dcase Challenge 2016

A Comparison of deep learning methods for environmental sound

Fully DNN-based Multi-label regression for audio tagging

Hierarchical learning for DNN-based acoustic scene classification

Sound Event Detection for Real Life Audio DCASE Challenge

عنوان ژورنال:

اشتراک گذاری